One- to Four-Dimensional Kernels for Virtual Screening and the Prediction of Physical, Chemical, and Biological Properties
Authors
Abstract
Many chemoinformatics applications, including high-throughput virtual screening, benefit from being able to rapidly predict the physical, chemical, and biological properties of small molecules to screen large repositories and identify suitable candidates. When training sets are available, machine learning methods provide an effective alternative to ab initio methods for these predictions. Here, we leverage rich molecular representations, including 1D SMILES strings, 2D graphs of bonds, and 3D coordinates, to derive efficient machine learning kernels for regression problems. We further expand the library of available spectral kernels for small molecules, originally developed for classification problems, to include 2.5D surface and 3D kernels using Delaunay tetrahedrization and other techniques from computational geometry, 3D pharmacophore kernels, and 3.5D or 4D kernels capable of taking into account multiple molecular configurations, such as conformers. The kernels are comprehensively tested using cross-validation and redundancy-reduction methods on regression problems, using several available data sets to predict boiling points, melting points, aqueous solubility, octanol/water partition coefficients, and biological activity, with state-of-the-art results. When sufficient training data are available, 2D spectral kernels generally tend to yield the best and most robust results, surpassing the state of the art. On data sets containing thousands of molecules, the kernels achieve a squared correlation coefficient of 0.91 for aqueous solubility prediction and 0.94 for octanol/water partition coefficient prediction. Averaging over conformations improves the performance of kernels based on the three-dimensional structure of molecules, especially on challenging data sets. Kernel predictors for aqueous solubility (kSOL), LogP (kLOGP), and melting point (kMELT) are available on the Web at http://cdb.ics.uci.edu.
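To make the idea of a 1D spectral kernel concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain character-level k-mer spectrum over raw SMILES strings; the function names (`spectrum`, `spectrum_kernel`, `minmax_kernel`) and the default `k = 3` are illustrative choices that do not come from the abstract.

```python
from collections import Counter


def spectrum(smiles: str, k: int = 3) -> Counter:
    """Count every contiguous length-k substring of a SMILES string.

    Assumption: characters are used as tokens; a real system might
    tokenize multi-character atoms such as 'Cl' or 'Br' first.
    """
    return Counter(smiles[i:i + k] for i in range(len(smiles) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> int:
    """Dot product of the two k-mer count vectors."""
    sa, sb = spectrum(a, k), spectrum(b, k)
    return sum(count * sb[kmer] for kmer, count in sa.items())


def minmax_kernel(a: str, b: str, k: int = 3) -> float:
    """Sum of per-k-mer minima divided by sum of per-k-mer maxima.

    Returns 0.0 when neither string contains any k-mer.
    """
    sa, sb = spectrum(a, k), spectrum(b, k)
    kmers = sa.keys() | sb.keys()
    numerator = sum(min(sa[x], sb[x]) for x in kmers)
    denominator = sum(max(sa[x], sb[x]) for x in kmers)
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    ethanol, propanol = "CCO", "CCCO"
    print(spectrum_kernel(ethanol, propanol))
    print(minmax_kernel(ethanol, propanol))
```

A Gram matrix built from either function could, in principle, be passed to a kernel regressor that accepts precomputed kernels; which kernel and normalization work best is an empirical question that the abstract does not settle.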
Similar resources
The Prediction of Thermo Physical, Vibrational Spectroscopy, Chemical Reactivity, Biological Properties of Morpholinium Borate, Phosphate, Chloride and Bromide Ionic Liquid: A DFT Study
Using computational chemistry, the physical, chemical, and biological properties of morpholinium cation-based ionic liquids are highlighted. The physical properties are evaluated through the Density Functional Theory (DFT) of Molecular Mechanics, and the chemical and biological properties are also examined. The difference between Highest Occupied Molecular Orbital ...
Application of Artificial Neural Networks (ANN) and Image Processing for Prediction of Gravimetrical Properties of Roasted Pistachio Nuts and Kernels
Roasting is among the most common methods of nut processing, causing physical and chemical changes and ultimately increasing overall acceptance of the product. In this research, the effects of temperature (90, 120, and 150°C), time (20, 35, and 50 min), and roasting air velocity (0.5, 1.5, and 2.5 m/s) on gravimetrical properties of pistachio nuts and kernels including unit mass, true density, o...
Effect of different poultry wastes on physical, chemical and biological properties of soil
The effect of poultry waste application on the physicochemical and biological properties of sandy clay loam soil was investigated on a 7 m × 7 m plot of land. The plot was divided into four portions, and 7.5 kg each of broiler, cockerel, and layer waste was applied in slurry form to plots A, B, and C, while plot D was used as a control (no application), for eight weeks at two-week intervals. After the fourth a...
Quantitative Structure-Property Relationship to Predict Quantum Properties of Monocarboxylic Acids by Using Topological Indices
Abstract. Topological indices are numerical values associated with chemical constitution, intended to correlate chemical structure with various physical properties, chemical reactivity, or biological activity. Graph theory is a delightful playground for the exploration of proof techniques in discrete mathematics, and its results have applications in many areas of science. A graph is a ...
Application of Artificial Neural Networks (ANN) and Image Processing for Prediction of the Geometrical Properties of Roasted Pistachio Nuts and Kernels
Roasting is the most common way of processing pistachio nuts, and its purpose is to increase the product's overall acceptability. The purpose of this study was to investigate the effect of temperature (90, 120, and 150°C), time (20, 35, and 50 min), and roasting air velocity (0.5, 1.5, and 2.5 m/s) on geometrical attributes of pistachio nuts and kernels including principal dimensions, shape fac...
Journal:
- Journal of Chemical Information and Modeling
Volume 47, Issue 3
Pages: -
Publication date: 2007